-
Given a long time series, the distance profile of a query time series is the vector of distances between the query and every subsequence of the long time series that has the query's length. MASS (Mueen’s Algorithm for Similarity Search) is an algorithm that efficiently computes the distance profile under z-normalized Euclidean distance (Mueen et al., The fastest similarity search algorithm for time series subsequences under Euclidean distance, http://www.cs.unm.edu/~mueen/FastestSimilaritySearch.html, 2017). MASS is recognized as a useful tool in many data mining works. However, complete documentation of the increasingly efficient versions of the algorithm does not exist. In this paper, we formalize the notion of a distance profile, describe four versions of the MASS algorithm, show several extensions of distance profiles under various operating conditions, describe how MASS improves the performance of existing data mining algorithms, and finally show the utility of MASS in domains including seismology, robotics, and power grids.
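To make the idea of a distance profile concrete, the following is a minimal sketch of an FFT-based computation under z-normalized Euclidean distance. It is not one of the four MASS versions the paper documents; the function name `distance_profile`, the padding choice, and the use of NumPy are assumptions made for illustration. It also assumes that neither the query nor any subsequence is constant, since a zero standard deviation would cause a division by zero.

```python
import numpy as np

def distance_profile(query, series):
    """Z-normalized Euclidean distance from `query` to every subsequence of `series`.

    Illustrative sketch only: assumes no subsequence (and not the query) is constant.
    """
    q = np.asarray(query, dtype=float)
    t = np.asarray(series, dtype=float)
    m, n = len(q), len(t)

    # Sliding dot products between the query and all subsequences, computed as a
    # linear convolution of the series with the reversed query via the FFT.
    size = n + m  # any length >= n + m - 1 avoids circular wrap-around
    spectrum = np.fft.rfft(t, size) * np.fft.rfft(q[::-1], size)
    dots = np.fft.irfft(spectrum, size)[m - 1:n]

    # Mean and standard deviation of every length-m subsequence via cumulative sums.
    csum = np.cumsum(np.concatenate(([0.0], t)))
    csum_sq = np.cumsum(np.concatenate(([0.0], t * t)))
    mean_t = (csum[m:] - csum[:-m]) / m
    var_t = (csum_sq[m:] - csum_sq[:-m]) / m - mean_t ** 2
    std_t = np.sqrt(np.maximum(var_t, 0.0))

    # Pearson correlation of the query with each subsequence, then the identity
    # d^2 = 2m(1 - correlation) for z-normalized vectors of length m.
    corr = (dots - m * q.mean() * mean_t) / (m * q.std() * std_t)
    return np.sqrt(np.maximum(2.0 * m * (1.0 - corr), 0.0))
```

For a series of length n and a query of length m, the result has n - m + 1 entries, and the position of the smallest entry is the query's nearest-neighbor subsequence.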
-
Seismic monitoring systems sift through seismograms in real time, searching for target events such as underground explosions. In such a monitoring system, a burst of aftershocks (minor earthquakes that occur after a major earthquake over days or even years) can be a source of confounding signals. Such a burst of aftershock signals can overload the human analysts of the monitoring system. To alleviate this burden at the onset of a sequence of events (e.g., aftershocks), a human analyst can label the first few of these events and start an online classifier to filter out subsequent aftershock events. We propose an online few-shot classification model, FewSig, for time series data for the above use case. The framework of FewSig consists of a selective model that identifies the high-confidence positive events, which are used for updating the models, and a general classifier that labels the remaining events. Our specific technique uses a two-level decision tree selective model based on sliding DTW distance and a general classifier model based on distance metric learning with Neighborhood Component Analysis (NCA). The algorithm demonstrates surprising robustness when tested on univariate datasets from the UEA/UCR archive. Furthermore, we show two real-world earthquake events where FewSig reduces the human effort in monitoring applications by filtering out the aftershock events.
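The abstract does not spell out how the sliding DTW distance is computed, so the following is only a rough sketch of one plausible reading: the classic dynamic time warping distance between a labeled template and every equal-length window of a longer signal. The function names, the squared point cost, and the absence of a warping constraint or normalization are all assumptions, and the decision tree and NCA components of FewSig are not reproduced here.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Uses a squared point-wise cost and no warping-window constraint (assumptions).
    """
    inf = float("inf")
    # prev[j] holds the best cumulative cost of aligning the rows seen so far with b[:j].
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, start=1):
            cost = (x - y) ** 2
            # Extend the cheapest of: insertion, match, deletion.
            cur.append(cost + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1] ** 0.5


def sliding_dtw(template, signal):
    """DTW distance from `template` to every window of `signal` of the same length."""
    m = len(template)
    return [dtw_distance(template, signal[i:i + m]) for i in range(len(signal) - m + 1)]
```

A small value in the returned list marks a window that resembles the template even if it is locally stretched or compressed, which is the kind of evidence a selective model could threshold on.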
-
Monitoring systems have hundreds or thousands of distributed sensors gathering and transmitting real-time streaming data. The early detection of events in these systems, such as an earthquake in a seismic monitoring system, is the basis for essential tasks such as warning generation. To detect such events, it is usual to compute pairwise correlation across the disparate signals generated by the sensors. Since the data sources (e.g., sensors) are spatially separated, it is essential to consider the lagged correlation between the signals. In addition, many applications require processing a specific band of frequencies depending on the event’s type, which demands a filtering step before computing correlations. Due to the high speed of data generation and the large number of sensors in these systems, the filtering and lagged cross-correlation operations need to be efficient to provide real-time responses without data loss. This article proposes a technique named FilCorr that efficiently computes both operations in a single step. We achieve an order of magnitude speedup by maintaining frequency transforms over sliding windows. Our method is exact, devoid of sensitive parameters, and easily parallelizable. Besides our algorithm, we also provide a publicly available real-time system named Seisviz that employs FilCorr as its core mechanism for monitoring a seismometer network. We demonstrate that our technique is suitable for several monitoring applications, such as seismic signal monitoring, motion monitoring, and neural activity monitoring.
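One way to see why filtering and lagged cross-correlation can be merged is that both reduce to pointwise operations on frequency spectra. The sketch below illustrates this for a single pair of fixed windows; it is not the FilCorr algorithm, which additionally maintains the transforms over sliding windows. The function name, the ideal (brick-wall) band-pass mask, the zero-padding, and the normalization are assumptions chosen for illustration, and the band is assumed to contain at least one frequency bin with nonzero energy.

```python
import numpy as np

def filtered_lagged_correlation(x, y, fs, band, max_lag):
    """Normalized cross-correlation of band-pass filtered x and y for lags -max_lag..max_lag.

    Illustrative sketch: ideal band-pass mask, circular correlation on zero-padded data.
    A positive lag k means y trails x by k samples. Requires max_lag < len(x).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    size = 2 * len(x)  # zero-pad so that lagged products do not wrap around the raw data

    # Band-pass filtering is a pointwise mask on the spectrum.
    freqs = np.fft.rfftfreq(size, d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    X = np.fft.rfft(x, size) * mask
    Y = np.fft.rfft(y, size) * mask

    # Cross-correlation is also pointwise in frequency: conj(X) * Y.
    cross = np.fft.irfft(np.conj(X) * Y, size)

    # Energies of the filtered signals via Parseval; interior bins of the
    # half-spectrum count twice, the first and last bin once (size is even).
    weights = np.full(len(freqs), 2.0)
    weights[0] = weights[-1] = 1.0
    energy_x = np.sum(weights * np.abs(X) ** 2) / size
    energy_y = np.sum(weights * np.abs(Y) ** 2) / size

    corr = cross / np.sqrt(energy_x * energy_y)
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, corr[lags]  # negative indices pick up the negative lags
```

The lag at which the returned correlation peaks estimates the delay between the two sensors within the chosen frequency band.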
-
An essential task on streaming time series data is to compute pairwise correlation across disparate signal sources to identify significant events. In many monitoring applications, such as geospatial monitoring, motion monitoring, and critical infrastructure monitoring, correlation is observed at various frequency bands and temporal lags. In this paper, we consider computing filtered and lagged correlation on streaming time series data, which is challenging because the computation must be “in-sync” with the incoming stream for any detected events to be useful. We propose a technique to compute filtered and lagged correlation on streaming data efficiently by merging two individual operations: filtering and cross-correlation. We achieve an order of magnitude speed-up by maintaining frequency transforms over sliding windows. Our method is exact, devoid of sensitive parameters, and easily parallelizable. We demonstrate our technique in a seismic signal monitoring application.
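The abstract does not give the update rule used to maintain frequency transforms over sliding windows. As a hedged illustration of the general idea, the sketch below shows the standard sliding DFT recurrence, which updates every frequency bin in constant time when the window advances by one sample instead of recomputing the transform. The function names are hypothetical, and a practical system would also need to manage the floating-point drift that accumulates over many updates.

```python
import cmath

def dft(window):
    """Direct O(N^2) discrete Fourier transform, used here only as a reference."""
    n = len(window)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * j * k / n) for j, x in enumerate(window))
        for k in range(n)
    ]


def slide_dft(spectrum, oldest, newest):
    """Advance a window's DFT by one sample in O(N).

    `oldest` is the sample leaving the window and `newest` the sample entering it.
    Each bin k is updated as (X_k - oldest + newest) * exp(2*pi*i*k/N).
    """
    n = len(spectrum)
    return [
        (X - oldest + newest) * cmath.exp(2j * cmath.pi * k / n)
        for k, X in enumerate(spectrum)
    ]
```

Starting from `dft(stream[:N])`, repeatedly calling `slide_dft(spectrum, stream[t], stream[t + N])` yields the spectrum of each successive window, which is what makes per-sample frequency-domain processing feasible on a stream.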
